Distributed training of Large-scale Logistic models

Authors

  • Siddharth Gopal
  • Yiming Yang
Abstract

Regularized multinomial logistic regression has emerged as one of the most common methods for data classification and analysis. With the advent of large-scale data, it is common to find scenarios where the number of possible multinomial outcomes is large (on the order of thousands to tens of thousands) and the dimensionality is high. In such cases, the computational cost of training logistic models, or even of simply iterating through all the model parameters, is prohibitive. In this paper, we propose a training method for large-scale multinomial logistic models that breaks this bottleneck by enabling parallel optimization of the likelihood objective. Our experiments on large-scale datasets showed an order-of-magnitude reduction in training time.
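
To make the cost described in the abstract concrete, the following is a minimal sketch of the standard L2-regularized multinomial logistic (softmax) objective and its gradient in NumPy. It is not the parallel method proposed in the paper, which the abstract does not detail; the function name `softmax_nll_and_grad` and the parameter names `W`, `X`, `y`, and `lam` are assumptions chosen for illustration.

```python
import numpy as np

def softmax_nll_and_grad(W, X, y, lam):
    """L2-regularized multinomial logistic loss and its gradient.

    Hypothetical helper for illustration only (not the paper's method).
    W: (K, d) weight matrix, one row per class
    X: (n, d) feature matrix
    y: (n,) integer labels in [0, K)
    lam: L2 regularization strength
    """
    n = X.shape[0]
    scores = X @ W.T                              # (n, K): one score per class
    scores -= scores.max(axis=1, keepdims=True)   # shift for numerical stability
    log_Z = np.log(np.exp(scores).sum(axis=1))    # log-partition, couples all K classes
    log_probs = scores - log_Z[:, None]
    loss = -log_probs[np.arange(n), y].sum() + 0.5 * lam * np.sum(W * W)
    resid = np.exp(log_probs)                     # predicted class probabilities
    resid[np.arange(n), y] -= 1.0                 # probabilities minus one-hot labels
    grad = resid.T @ X + lam * W                  # (K, d): same shape as W
    return loss, grad
```

Every evaluation touches all K x d entries of `W`, and the log-partition term ties all K classes together for each example. As a purely illustrative figure, with K = 10,000 classes and an assumed d = 100,000 features, `W` alone holds 10^9 parameters.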

Similar resources

Distributed Newton Method for Regularized Logistic Regression

Regularized logistic regression is a very successful classification method, but for large-scale data, its distributed training has not been investigated much. In this work, we propose a distributed Newton method for training logistic regression. Many interesting techniques are discussed for reducing the communication cost. Experiments show that the proposed method is faster than state of the ar...

Revisiting Large Scale Distributed Machine Learning

Nowadays, with the widespread use of smartphones and other portable gadgets equipped with a variety of sensors, data is ubiquitously available, and the focus of machine learning has shifted from inferring from small training samples to dealing with large-scale, high-dimensional data. In domains such as personal healthcare applications, which motivate this survey, distributed machine learning...

Large Scale Image Classification

We consider multinomial logistic regression for large-scale image classification. The model is trained using 1,000,000 images from the ImageNet Large Scale Visual Recognition Challenge 2010 [2]. We train five models over different subsets of sampled training observations. Finally, we combine all five models to classify the test data. The combined classifier gives good performance. W...

Distributed Newton Methods for Regularized Logistic Regression

Regularized logistic regression is a very useful classification method, but for large-scale data, its distributed training has not been investigated much. In this work, we propose a distributed Newton method for training logistic regression. Many interesting techniques are discussed for reducing the communication cost and speeding up the computation. Experiments show that the proposed method is...

Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models

Training conditional maximum entropy models on massive data sets requires significant computational resources. We examine three common distributed training methods for conditional maxent: a distributed gradient computation method, a majority vote method, and a mixture weight method. We analyze and compare the CPU and network time complexity of each of these methods and present a theoretical ana...

Publication date: 2013